{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "# Step 2: Model Building & Operationalization\n",
    "Using the training and test data sets we constructed in the 1st notebook`1_data_ingestion_and_preparation.ipynb`, in this notebook we will buid the model using a neural network type called LSTM network for scenario described at [Predictive Maintenance Template](https://gallery.cortanaintelligence.com/Collection/Predictive-Maintenance-Template-3) to predict failure in aircraft engines.\n",
    "\n",
    "Once trained, we will operationalize the model through the deployment of a web service using Azure container instance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2\n",
    "\n",
    "import os\n",
    "import json\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from math import exp\n",
    "from common.utils import to_tensors\n",
    "from sklearn.metrics import (precision_score,recall_score,f1_score)\n",
    "\n",
    "from azureml.core import  (Workspace,Run,VERSION,\n",
    "                           Experiment,Datastore)\n",
    "from azureml.core.compute import (AmlCompute, ComputeTarget)\n",
    "from azureml.exceptions import ComputeTargetException\n",
    "\n",
    "from azureml.train.dnn import PyTorch\n",
    "from azureml.train.hyperdrive import *\n",
    "from azureml.widgets import RunDetails\n",
    "\n",
    "from azureml.core import Environment\n",
    "from azureml.core.model import InferenceConfig,Model\n",
    "from azureml.core.webservice import AciWebservice\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "PROJECT_DIR = os.getcwd()\n",
    "TRAINING_DIR = os.path.join(PROJECT_DIR, 'train')\n",
    "SCORING_DIR = os.path.join(PROJECT_DIR, 'score')\n",
    "EXPERIMENT_NAME = \"deep_predictive_maintenance\"\n",
    "CLUSTER_NAME = \"gpu-cluster\"\n",
    "ACI_SVC_NAME = 'predictive-maintenance-svc'\n",
    "\n",
    "print('SDK verison', VERSION)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Azure ML workspace"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ws = Workspace.from_config()\n",
    "print('Workspace loaded:', ws.name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data store\n",
    "\n",
    "We have previously created the labeled data set in the `Code\\1_Data Ingestion and Preparation.ipynb` Jupyter notebook and stored it in default data store of the AML workspace.\n",
    "\n",
    "Here, we call path method that returns an instance to [data reference](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.data_reference.datareference?view=azure-ml-py) which  will be passed to the training script during the run execution."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds = ws.datastores['workspaceblobstore']\n",
    "data_path = \"data\"\n",
    "ds_path = ds.path(data_path)\n",
    "print(ds_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Compute target\n",
    "\n",
    "Here, we provision the AML Compute that will be used to execute training script"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    compute_target = ComputeTarget(workspace=ws, name=CLUSTER_NAME)\n",
    "    print('Found existing compute target.')\n",
    "except ComputeTargetException:\n",
    "    print('Creating a new compute target...')\n",
    "    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6',\n",
    "                                                           max_nodes=6)\n",
    "\n",
    "    # create the cluster\n",
    "    compute_target = ComputeTarget.create(ws, CLUSTER_NAME, compute_config)\n",
    "\n",
    "compute_target.wait_for_completion(show_output=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Modeling\n",
    "\n",
    "The traditional predictive maintenance machine learning models are based on feature engineering, the manual construction of variable using domain expertise and intuition. This usually makes these models hard to reuse as the feature are specific to the problem scenario and the available data may vary between customers. Perhaps the most attractive advantage of deep learning they automatically do feature engineering from the data, eliminating the need for the manual feature engineering step.\n",
    "\n",
    "When using LSTMs in the time-series domain, one important parameter is the sequence length, the window to examine for failure signal. This may be viewed as picking a `window_size` (i.e. 5 cycles) for calculating the rolling features in the [Predictive Maintenance Template](https://gallery.cortanaintelligence.com/Collection/Predictive-Maintenance-Template-3). The rolling features included rolling mean and rolling standard deviation over the 5 cycles for each of the 21 sensor values. In deep learning, we allow the LSTMs to extract abstract features out of the sequence of sensor values within the window. The expectation is that patterns within these sensor values will be automatically encoded by the LSTM.\n",
    "\n",
    "Another critical advantage of LSTMs is their ability to remember from long-term sequences (window sizes) which is hard to achieve by traditional feature engineering. Computing rolling averages over a window size of 50 cycles may lead to loss of information due to smoothing over such a long period. LSTMs are able to use larger window sizes and use all the information in the window as input. \n",
    "\n",
    "http://colah.github.io/posts/2015-08-Understanding-LSTMs/ contains more information on the details of LSTM networks.\n",
    "\n",
    "This sample illustrates the LSTM approach to binary classification using a sequence_length of 50 cycles to predict the probability of engine failure within 30 days."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##  Implementation and hyperparameters tuning\n",
    "\n",
    "Building a Neural Net requires determining the network architecture.\n",
    "\n",
    "In this scenario we will build the network using Pytorch framework as opposed to original sample that have used Keras. As such from the SDK, we use the pytorch [estimator](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-train-pytorch).\n",
    "\n",
    "Search for the best hyperparameters is achieved using [Hyperdrive](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-tune-hyperparameters)\n",
    "\n",
    "In the train directory, the listed below files are used as follow:\n",
    "\n",
    " - Utils.py: contains data preparation to read csv files and transform them into lstm ready 3D tensors.\n",
    " - network.py: contains LSTM network definition\n",
    " - train.py: training and evaluation code\n",
    " - entry.py: train and save model to storage"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pytorch Estimator\n",
    "\n",
    "Here, we define the Pytorch estimator."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "script_params = {\n",
    "    '--epochs': 2,\n",
    "    '--data_path': ds_path,\n",
    "    '--output_dir': './outputs'\n",
    "}\n",
    "\n",
    "estimator = PyTorch(source_directory = TRAINING_DIR, \n",
    "                    conda_packages = ['pandas', 'numpy', 'scikit-learn'],\n",
    "                    script_params=script_params,\n",
    "                    compute_target=compute_target,\n",
    "                    entry_script='entry.py',\n",
    "                    use_gpu=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Hyperparameters tuning using Hyperdrive\n",
    "\n",
    "Here, we define hyerdrive configuration, given the high cost associated with aircraft engine failure and how detrimental it is, we will tune and optimize our model for recall metric.\n",
    "\n",
    "For completness we will be tracking precision and F1 as well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "param_sampling = RandomParameterSampling( {\n",
    "        'learning_rate':uniform(1e-4, 1e-2),\n",
    "        'l2':uniform(1e-4, 1e-3),\n",
    "        'dropout':uniform(.5,.7),\n",
    "        'batch_size':choice(16,32,64),\n",
    "        'hidden_units':choice(4,6)\n",
    "    }\n",
    ")\n",
    "\n",
    "termination_policy = BanditPolicy(slack_factor=.1, \n",
    "                                  evaluation_interval=1, \n",
    "                                  delay_evaluation=1\n",
    "                                 )\n",
    "\n",
    "hd_run_config = HyperDriveConfig(estimator=estimator,\n",
    "                                 hyperparameter_sampling=param_sampling,\n",
    "                                 policy=termination_policy,\n",
    "                                 primary_metric_name='recall',\n",
    "                                 primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,\n",
    "                                 max_total_runs=10,\n",
    "                                 max_concurrent_runs=5\n",
    "                                )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We submit the exepriment for execution and render the Run execution through the widget"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "experiment = Experiment(workspace=ws, name=EXPERIMENT_NAME)\n",
    "\n",
    "run = experiment.submit(hd_run_config)\n",
    "run"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "RunDetails(run).show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Model registration\n",
    "\n",
    "With the model training done and hyperparameters tuned, we save the best trained model found by hyperdrive based on the primary metric, we have selected."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "best_run = run.get_best_run_by_primary_metric()\n",
    "\n",
    "model = best_run.register_model(model_name='deep_pdm', model_path='outputs/network.pth')\n",
    "print(model.name, 'saved')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Model operationalization\n",
    "\n",
    "\n",
    "We are now ready to operationalizing the model and deloying the webservice. For testing purposes, we wil use ACI to serve predictions.\n",
    "\n",
    "For More details on Model deployment workflow in Azure Machine learning service,click [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-deploy-and-where#deployment-workflow) \n",
    "\n",
    "The artifacts included in the image are under the score  directory. \n",
    "The listed below files are used as follow:\n",
    "\n",
    " - network.py: contains LSTM network definition \n",
    " - score.py: scoring file containing model loading and predictin serving\n",
    " - myenv.yml: contain python libraries needed by score.py"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Image creation\n",
    "\n",
    "Here, we instantiate an image configuration object and follow-up with Image creation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Image_configuration call require  current directory to be where score.py and dependencies reside\n",
    "\n",
    "os.chdir(SCORING_DIR) \n",
    "print(\"Switched current directory to\",os.getcwd())\n",
    "\n",
    "image_config = ContainerImage.image_configuration(execution_script = \"score.py\",\n",
    "                                                 runtime = \"python\",\n",
    "                                                 conda_file = \"myenv.yml\",\n",
    "                                                 dependencies = [\"network.py\"],\n",
    "                                                 description = \"Image of predictive maintenance model\",\n",
    "                                                 tags = { \"type\": \"lstm_classifier\"}\n",
    "                                                 )\n",
    "image = ContainerImage.create(name = \"dpm-image\", \n",
    "                              models = [model], \n",
    "                              image_config = image_config,\n",
    "                              workspace = ws\n",
    "                              )\n",
    "image.wait_for_creation(show_output=True)\n",
    "\n",
    "\n",
    "os.chdir(PROJECT_DIR)\n",
    "print(\"Reverted to root experiment directory\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Web service deployment\n",
    "\n",
    "With image built and published in the Azure container registry associated with our Azure machine learning workspace, we proceed with the deployment of the web service. for testing purposes, we opt for Azure container instance instead of Azure Kubernetes service cluster"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "aci_config = AciWebservice.deploy_configuration(cpu_cores=2, \n",
    "                                               memory_gb=2, \n",
    "                                               tags={\"type\":\"deep predictive maintenance\"}, \n",
    "                                               description='Predict equipment failure')\n",
    "\n",
    "service = Webservice.deploy_from_image(workspace=ws,\n",
    "                                       name=ACI_SVC_NAME,\n",
    "                                       deployment_config=aci_config,\n",
    "                                       image = image)\n",
    "\n",
    "service.wait_for_deployment(show_output=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Test Web service\n",
    "\n",
    "Finally, we score the test data set against the webservice we've just deployed,and we'll report peformance metrics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "testfile_path = os.path.join(PROJECT_DIR, 'data/preprocessed_test_file.csv')\n",
    "X,y,engine_ids = to_tensors(testfile_path, is_test = True)\n",
    "\n",
    "output_df = pd.DataFrame(columns = ['engine ID', 'prediction', 'likelihood'])\n",
    "\n",
    "for i,x in enumerate(X):\n",
    "    output =service.run(json.dumps({'input_data': x[np.newaxis,:].tolist()}))\n",
    "    output_df.loc[i] = [str(engine_ids[i]), float(output['prediction']),\n",
    "                        round(exp(output['likelihood']),2)]\n",
    "                                                                       \n",
    "output_df.T"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Test set performance\n",
    "\n",
    "Lastly, we report precision, recall and F1 performance metrics on the test data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_hat = output_df.prediction\n",
    "\n",
    "print(\"Precision:\",round(precision_score(y, y_hat),2))\n",
    "print(\"Recall:\",round(recall_score(y, y_hat),2))\n",
    "print(\"F1:\",round(f1_score(y, y_hat),2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Tear down resources\n",
    "\n",
    "Now that we're done, we delete the ACI deployment"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "service.delete()"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python [conda env:amlenv]",
   "language": "python",
   "name": "conda-env-amlenv-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
